In this document, we investigate the key questions set out in the numbered sections below, assuming a fixed array of \(10^6\) SNPs.

We first focus on a quantitative trait in which SNP effect sizes follow a normal distribution.

1. When is Winner’s Curse a problem?

In this section, we look at the average number of significant SNPs, the average proportion of these significant SNPs whose association estimates are more extreme than their true effect sizes, and the average MSE of significant SNPs at two different thresholds: the common genome-wide significance threshold of \(5 \times 10^{-8}\) and a higher, less stringent threshold of \(5 \times 10^{-4}\). We consider these properties under combinations of values for the following parameters:

  1. sample size - n_samples
  2. heritability - h2
  3. polygenicity, i.e. proportion of effect SNPs - prop_effect
  4. selection coefficient - S

The 24 different combinations that we will investigate throughout this document are detailed below:

Scenario n_samples h2 prop_effect S
1 30,000 0.3 0.010 -1
2 300,000 0.3 0.010 -1
3 30,000 0.8 0.010 -1
4 300,000 0.8 0.010 -1
5 30,000 0.3 0.001 -1
6 300,000 0.3 0.001 -1
7 30,000 0.8 0.001 -1
8 300,000 0.8 0.001 -1
9 30,000 0.3 0.010 0
10 300,000 0.3 0.010 0
11 30,000 0.8 0.010 0
12 300,000 0.8 0.010 0
13 30,000 0.3 0.001 0
14 300,000 0.3 0.001 0
15 30,000 0.8 0.001 0
16 300,000 0.8 0.001 0
17 30,000 0.3 0.010 1
18 300,000 0.3 0.010 1
19 30,000 0.8 0.010 1
20 300,000 0.8 0.010 1
21 30,000 0.3 0.001 1
22 300,000 0.3 0.001 1
23 30,000 0.8 0.001 1
24 300,000 0.8 0.001 1
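The scenario table above is a full factorial design over the four parameters. As a minimal sketch (in Python rather than the R used by the scripts referenced in this document, and with hypothetical names such as `build_scenarios`), the same 24 rows can be generated in the same order as follows:

```python
from itertools import product

# Parameter values copied from the scenario table above.
S_VALUES = (-1, 0, 1)
PROP_EFFECT_VALUES = (0.01, 0.001)
H2_VALUES = (0.3, 0.8)
N_SAMPLES_VALUES = (30_000, 300_000)


def build_scenarios():
    """Return the 24 scenarios as dictionaries, numbered as in the table."""
    scenarios = []
    # product() varies its last argument fastest, so n_samples alternates
    # on every row, then h2, then prop_effect, and S changes slowest.
    for S, prop_effect, h2, n_samples in product(
        S_VALUES, PROP_EFFECT_VALUES, H2_VALUES, N_SAMPLES_VALUES
    ):
        scenarios.append(
            {
                "scenario": len(scenarios) + 1,
                "n_samples": n_samples,
                "h2": h2,
                "prop_effect": prop_effect,
                "S": S,
            }
        )
    return scenarios


scenarios = build_scenarios()
print(scenarios[0])
print(scenarios[-1])
```

Keeping S as the slowest-varying parameter reproduces the blocks of eight rows per selection coefficient seen in the table.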


Running the code provided in nsig_prop_bias_100sim.R, we obtain the following results. Here prop_bias is the proportion of significant SNPs whose association estimates are more extreme than their true effect sizes, while prop_x is the proportion of significant SNPs that are significantly overestimated, i.e. those SNPs for which \(\left| \hat\beta_{i} \right| > \left| \beta_{i} \right| + 1.96\cdot\sigma_{i}\).

Scenario n_samples h2 prop_effect S n_sig_5e-8 prop_bias_5e-8 prop_x_5e-8 mse_5e-8 n_sig_5e-4 prop_bias_5e-4 prop_x_5e-4 mse_5e-4 sd(n_sig)_5e-8 sd(prop_bias)_5e-8 sd(prop_x)_5e-8 sd(mse)_5e-8 sd(n_sig)_5e-4 sd(prop_bias)_5e-4 sd(prop_x)_5e-4 sd(mse)_5e-4
1 30,000 0.3 0.010 -1 0.70 1.0000 0.9000 0.001573 610.99 0.9996 0.9165 0.001919 0.745 0.0000 0.2950 0.001234 23.070 0.0008 0.0108 0.000105
2 300,000 0.3 0.010 -1 848.63 0.7619 0.0895 0.000022 3201.35 0.7461 0.2083 0.000049 18.145 0.0142 0.0103 0.000002 47.967 0.0070 0.0071 0.000003
3 30,000 0.8 0.010 -1 31.85 0.9804 0.4020 0.000598 1089.98 0.9597 0.5647 0.001194 5.208 0.0280 0.0834 0.000198 35.158 0.0053 0.0136 0.000060
4 300,000 0.8 0.010 -1 2760.68 0.6284 0.0484 0.000017 5349.44 0.6393 0.1305 0.000035 30.091 0.0084 0.0038 0.000001 43.729 0.0069 0.0045 0.000002
5 30,000 0.3 0.001 -1 86.90 0.7591 0.0892 0.000214 772.12 0.8957 0.6724 0.001474 6.317 0.0437 0.0290 0.000054 25.913 0.0090 0.0140 0.000088
6 300,000 0.3 0.001 -1 568.70 0.5509 0.0330 0.000016 1211.96 0.7303 0.4300 0.000099 14.074 0.0225 0.0078 0.000002 24.787 0.0128 0.0127 0.000006
7 30,000 0.8 0.001 -1 276.26 0.6273 0.0464 0.000168 986.59 0.8043 0.5276 0.001198 10.413 0.0271 0.0123 0.000027 26.682 0.0117 0.0138 0.000075
8 300,000 0.8 0.001 -1 727.10 0.5257 0.0287 0.000016 1326.17 0.7068 0.3971 0.000093 12.491 0.0171 0.0061 0.000001 24.072 0.0127 0.0105 0.000005
9 30,000 0.3 0.010 0 1.45 1.0000 0.8610 0.001661 627.16 0.9985 0.8888 0.001825 1.298 0.0000 0.2745 0.005559 25.180 0.0015 0.0128 0.000090
10 300,000 0.3 0.010 0 882.45 0.7297 0.0807 0.000012 3054.78 0.7427 0.2176 0.000046 18.435 0.0152 0.0098 0.000001 45.422 0.0070 0.0066 0.000002
11 30,000 0.8 0.010 0 48.06 0.9509 0.2908 0.000245 1110.51 0.9407 0.5435 0.001091 6.350 0.0301 0.0645 0.000050 29.205 0.0068 0.0123 0.000060
12 300,000 0.8 0.010 0 2586.78 0.6204 0.0477 0.000011 5032.06 0.6455 0.1375 0.000032 32.771 0.0096 0.0038 0.000000 45.218 0.0062 0.0046 0.000002
13 30,000 0.3 0.001 0 88.32 0.7254 0.0774 0.000116 755.09 0.8953 0.6820 0.001491 6.377 0.0480 0.0279 0.000025 22.799 0.0097 0.0131 0.000077
14 300,000 0.3 0.001 0 531.27 0.5560 0.0334 0.000012 1183.18 0.7402 0.4407 0.000100 12.431 0.0215 0.0074 0.000001 27.843 0.0121 0.0091 0.000006
15 30,000 0.8 0.001 0 257.44 0.6234 0.0473 0.000106 953.63 0.8130 0.5449 0.001202 8.936 0.0276 0.0120 0.000013 23.262 0.0100 0.0115 0.000070
16 300,000 0.8 0.001 0 691.46 0.5268 0.0301 0.000013 1301.33 0.7114 0.4040 0.000092 13.526 0.0185 0.0066 0.000001 29.469 0.0110 0.0123 0.000005
17 30,000 0.3 0.010 1 2.55 0.9941 0.7339 0.000610 641.18 0.9966 0.8643 0.001777 1.623 0.0405 0.3034 0.000674 25.497 0.0022 0.0126 0.000107
18 300,000 0.3 0.010 1 919.28 0.7031 0.0706 0.000010 2902.71 0.7313 0.2228 0.000046 15.799 0.0142 0.0079 0.000001 38.290 0.0086 0.0073 0.000003
19 30,000 0.8 0.010 1 68.13 0.9178 0.2410 0.000198 1141.84 0.9208 0.5196 0.001047 7.795 0.0370 0.0486 0.000087 29.135 0.0077 0.0144 0.000063
20 300,000 0.8 0.010 1 2433.27 0.6105 0.0462 0.000009 4634.50 0.6464 0.1461 0.000032 30.705 0.0107 0.0047 0.000000 48.828 0.0078 0.0046 0.000002
21 30,000 0.3 0.001 1 93.15 0.7056 0.0684 0.000096 742.12 0.8944 0.6949 0.001509 5.960 0.0428 0.0273 0.000015 23.899 0.0102 0.0140 0.000080
22 300,000 0.3 0.001 1 482.94 0.5516 0.0329 0.000009 1124.74 0.7510 0.4635 0.000103 12.742 0.0244 0.0072 0.000001 24.808 0.0116 0.0117 0.000006
23 30,000 0.8 0.001 1 244.29 0.6120 0.0448 0.000089 917.81 0.8199 0.5673 0.001249 9.237 0.0272 0.0142 0.000010 25.125 0.0110 0.0137 0.000084
24 300,000 0.8 0.001 1 632.74 0.5336 0.0294 0.000010 1235.79 0.7201 0.4192 0.000094 14.730 0.0194 0.0071 0.000001 25.984 0.0121 0.0112 0.000005
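To make the four summary quantities concrete, here is a minimal Python sketch of how n_sig, prop_bias, prop_x and mse could be computed for one simulated data set. It is not the code in nsig_prop_bias_100sim.R; the function name `summarise_significant` and the toy numbers are hypothetical, and the standard error `se` is assumed to play the role of \(\sigma_i\) in the definition of prop_x.

```python
def summarise_significant(beta_hat, beta_true, se, z_threshold):
    """Summarise SNPs whose |z| = |beta_hat / se| exceeds z_threshold."""
    sig = [
        (bh, bt, s)
        for bh, bt, s in zip(beta_hat, beta_true, se)
        if abs(bh / s) > z_threshold
    ]
    n_sig = len(sig)
    if n_sig == 0:
        # No significant SNPs: the proportions and MSE are undefined.
        return {"n_sig": 0, "prop_bias": None, "prop_x": None, "mse": None}
    # Estimate more extreme than the true effect size.
    prop_bias = sum(abs(bh) > abs(bt) for bh, bt, _ in sig) / n_sig
    # Estimate more extreme than the true effect by over 1.96 standard errors.
    prop_x = sum(abs(bh) > abs(bt) + 1.96 * s for bh, bt, s in sig) / n_sig
    mse = sum((bh - bt) ** 2 for bh, bt, _ in sig) / n_sig
    return {"n_sig": n_sig, "prop_bias": prop_bias, "prop_x": prop_x, "mse": mse}


# Toy example with four SNPs (hypothetical values, not simulation output).
summary = summarise_significant(
    beta_hat=[0.060, 0.058, 0.010, -0.070],
    beta_true=[0.030, 0.056, 0.010, -0.075],
    se=[0.01, 0.01, 0.01, 0.01],
    z_threshold=5.45,
)
print(summary)
```

In this toy example three SNPs pass the threshold, two of them are more extreme than their true effect, and only one is overestimated by more than 1.96 standard errors.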


★ It is important to note here that for scenarios 1, 9 and 17, very few significant SNPs are detected on average. In some instances, we may even find that no SNPs are deemed significant at a threshold of \(5 \times 10^{-8}\). We must keep this observation in mind going forward as we investigate the performance of methods under these three scenarios.

For both thresholds, the average number of significant SNPs increases with sample size, as expected, and also with heritability. The effect of changing prop_effect is more interesting. At the genome-wide significance threshold of \(5 \times 10^{-8}\), decreasing the proportion of effect SNPs from 0.01 to 0.001 increases the number of significant SNPs for a sample size of 30,000 but decreases it for the larger sample size of 300,000. At \(5 \times 10^{-4}\), the same decrease is seen for a sample size of 300,000, while for 30,000 the number of significant SNPs rises when h2 is 0.3 and falls when h2 is 0.8.

Furthermore, increasing sample size and increasing heritability from 0.3 to 0.8 both tend to decrease the fraction of significant SNPs whose estimates are more extreme than their true effect size. Decreasing prop_effect from 0.01 to 0.001 has the same effect at a significance threshold of \(5 \times 10^{-8}\).

In all scenarios, as sample size increases from 30,000 to 300,000, prop_x, the proportion of significant SNPs that are significantly overestimated, decreases. This is a strong indicator that as n_samples increases, the bias induced by Winner’s Curse becomes less of a problem among significant SNPs. However, the decrease could also partly reflect the fact that the number of SNPs passing the genome-wide significance threshold of 5e-8 increases with sample size.

To gain better insight into the information detailed in the above table, we simulate a single set of GWAS summary statistics and plot \(z\) vs \(\text{bias}\), where \(\text{bias} = \hat\beta - \beta\), for each of the 24 different scenarios. On all figures, the bright red line corresponds to the significance threshold of \(5 \times 10^{-8}\) while the darker red line relates to \(5 \times 10^{-4}\). The points corresponding to SNPs which are significantly overestimated and are significant at a threshold of \(5 \times 10^{-4}\) are coloured in navy.
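On a plot with \(z\) on one axis, the two threshold lines sit at the \(z\)-scores corresponding to the two p-value thresholds. A short Python sketch of this conversion, assuming a two-sided test so that the cut-off is \(Z_{\frac{\alpha}{2}}\) (the helper name `z_threshold` is hypothetical):

```python
from statistics import NormalDist


def z_threshold(alpha):
    """Two-sided z-score cut-off for a p-value threshold alpha."""
    # inv_cdf(alpha / 2) is the lower-tail quantile, so negate it.
    return -NormalDist().inv_cdf(alpha / 2)


z_genome_wide = z_threshold(5e-8)
z_lenient = z_threshold(5e-4)
print(round(z_genome_wide, 2), round(z_lenient, 2))
```

This gives cut-offs of roughly 5.45 for \(5 \times 10^{-8}\) and 3.48 for \(5 \times 10^{-4}\) in absolute value.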


2. Evaluating methods using a significance threshold of \(5 \times 10^{-8}\)

Using the code detailed in norm_5e-8_20sim.R and a total of 20 simulations, we evaluated six different Winner’s Curse methods across each of the 24 scenarios using the following three bias evaluation metrics:

  1. The average fraction of significant SNPs whose association estimates are improved by the method - flb
  2. The average change in average MSE of significant SNPs after applying the method - mse
  3. The average relative change in average MSE of significant SNPs after applying the method - rel_mse

Note: All averages are obtained over only those simulations in which at least one significant SNP was detected.

Firstly, the fraction of the \(n\) significant SNPs whose association estimates are improved by a method may be written as: \[\frac{1}{n} \; \sum_{i=1}^{n}\mathbb{I} \left\{ \left| \hat\beta_i - \beta_i \right| > \left|\hat\beta_{\text{adj,}i} - \beta_i\right| \right\},\] in which \(\left| \frac{\hat\beta_i}{\hat\sigma_i} \right| > Z_{\frac{\alpha}{2}}\) for all \(i = 1,\dots,n\), where \(\hat\beta_i\) is the naive estimated effect size of SNP \(i\), \(\hat\sigma_i\) is its estimated standard error, \(\beta_i\) is its true effect size and \(\hat\beta_{\text{adj,}i}\) is the adjusted estimate obtained by applying the Winner’s Curse adjustment method of interest. The significance threshold is represented by \(\alpha\).

Using the same notation, the average MSE over \(n\) significant SNPs is defined as: \[\frac{1}{n} \sum^n_{i=1} (\hat\beta_i - \beta_i)^2.\] Thus, using the above, we may formally define the change in average MSE of significant SNPs as: \[\frac{1}{n} \sum^n_{i=1} (\hat\beta_{\text{adj,}i} - \beta_i)^2 - \frac{1}{n} \sum^n_{i=1} (\hat\beta_i - \beta_i)^2\] and the relative change in average MSE of significant SNPs as: \[\frac{\frac{1}{n} \sum^n_{i=1} (\hat\beta_{\text{adj,}i} - \beta_i)^2 - \frac{1}{n} \sum^n_{i=1} (\hat\beta_i - \beta_i)^2}{\frac{1}{n} \sum^n_{i=1} (\hat\beta_i - \beta_i)^2}.\]
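The three metrics defined above translate directly into code. The following Python sketch applies them to one set of significant SNPs. The function names and the toy values are hypothetical, and the averaging over simulations described earlier is not shown.

```python
def average_mse(estimates, beta_true):
    """Average squared error of a set of estimates against the true effects."""
    return sum((e - b) ** 2 for e, b in zip(estimates, beta_true)) / len(beta_true)


def evaluate_adjustment(beta_hat, beta_adj, beta_true):
    """Return flb, mse (change in MSE) and rel_mse for the significant SNPs."""
    n = len(beta_true)
    # flb: fraction of SNPs whose adjusted estimate is closer to the truth.
    flb = sum(
        abs(bh - b) > abs(ba - b)
        for bh, ba, b in zip(beta_hat, beta_adj, beta_true)
    ) / n
    mse_naive = average_mse(beta_hat, beta_true)
    mse_adj = average_mse(beta_adj, beta_true)
    change = mse_adj - mse_naive
    # rel_mse is undefined if the naive estimates are exactly right.
    return {"flb": flb, "mse": change, "rel_mse": change / mse_naive}


# Toy example with three significant SNPs (hypothetical values).
metrics = evaluate_adjustment(
    beta_hat=[0.06, 0.06, -0.05],
    beta_adj=[0.04, 0.05, -0.07],
    beta_true=[0.03, 0.05, -0.04],
)
print(metrics)
```

Negative values of mse and rel_mse indicate that the adjusted estimates have a lower average MSE than the naive ones. Here two of the three estimates improve and rel_mse is about -0.09.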


The results of the simulations are plotted with error bars. These figures show more clearly the scenarios in which it would be beneficial to apply a Winner’s Curse correction method and give a better indication of which method to use.

A replication method, which selects significant SNPs from a discovery GWAS and then uses a replication GWAS of the same size to obtain association estimates for these SNPs, is also included in the plots. This serves as a benchmark for the other methods.
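The logic behind the replication benchmark can be illustrated with a small Python sketch for a single SNP. All numbers here (true effect, standard error, cut-off, number of trials, seed) are hypothetical choices for illustration, and the standard error is assumed to be known and equal in both studies, as it would be for two GWASs of the same size.

```python
import random
from statistics import mean


def simulate_replication(beta=0.02, se=0.01, z_cut=3.48, n_trials=20_000, seed=1):
    """Select on the discovery estimate, then draw an independent replication estimate."""
    rng = random.Random(seed)
    discovery, replication = [], []
    for _ in range(n_trials):
        b_disc = rng.gauss(beta, se)
        if abs(b_disc / se) > z_cut:
            # Selected in discovery: keep the (biased) discovery estimate ...
            discovery.append(b_disc)
            # ... and an independent estimate from the replication study.
            replication.append(rng.gauss(beta, se))
    return discovery, replication


disc, rep = simulate_replication()
print(len(disc), mean(disc), mean(rep))
```

Because the replication estimate plays no part in the selection, its average stays close to the true effect, whereas the average discovery estimate among selected trials is inflated.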

Summary of results for flb contained in norm_5e-8_20sim.csv:

Scenario n_samples h2 prop_effect S EB FIQT BR cl1 cl2 cl3 rep
1 30,000 0.3 0.010 -1 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000 1.0000
2 300,000 0.3 0.010 -1 0.6396 0.4446 0.6034 0.5289 0.4974 0.5095 0.5612
3 30,000 0.8 0.010 -1 0.9140 0.7580 0.7802 0.6152 0.7338 0.6657 0.8166
4 300,000 0.8 0.010 -1 0.5673 0.2811 0.5319 0.5174 0.4865 0.5014 0.5138
5 30,000 0.3 0.001 -1 0.6312 0.3257 0.4595 0.5253 0.4911 0.5068 0.5631
6 300,000 0.3 0.001 -1 0.4926 0.1340 0.2759 0.5097 0.4778 0.5017 0.5002
7 30,000 0.8 0.001 -1 0.5230 0.2248 0.2974 0.5213 0.4886 0.5037 0.5118
8 300,000 0.8 0.001 -1 0.4757 0.1053 0.3009 0.5028 0.4711 0.4894 0.5082
9 30,000 0.3 0.010 0 0.9062 0.9375 0.8438 0.9062 0.9375 0.9062 0.9375
10 300,000 0.3 0.010 0 0.6212 0.4058 0.5772 0.5291 0.4977 0.5120 0.5466
11 30,000 0.8 0.010 0 0.8435 0.6979 0.7417 0.5920 0.6757 0.6251 0.7557
12 300,000 0.8 0.010 0 0.5589 0.2708 0.5246 0.5120 0.4817 0.4958 0.5092
13 30,000 0.3 0.001 0 0.6006 0.3048 0.4487 0.5061 0.4820 0.4932 0.5441
14 300,000 0.3 0.001 0 0.4862 0.1390 0.2821 0.5043 0.4742 0.4964 0.4942
15 30,000 0.8 0.001 0 0.5159 0.2013 0.2860 0.5130 0.4872 0.5008 0.5100
16 300,000 0.8 0.001 0 0.4635 0.1089 0.3005 0.5023 0.4653 0.4833 0.5002
17 30,000 0.3 0.010 1 0.7792 0.8292 0.8083 0.6670 0.8708 0.7620 0.9358
18 300,000 0.3 0.010 1 0.5961 0.3647 0.5290 0.5219 0.4902 0.5033 0.5399
19 30,000 0.8 0.010 1 0.7829 0.5712 0.6422 0.5588 0.5879 0.5679 0.6950
20 300,000 0.8 0.010 1 0.5510 0.2573 0.5109 0.5095 0.4803 0.4957 0.5121
21 30,000 0.3 0.001 1 0.5875 0.2870 0.4261 0.5109 0.4843 0.4958 0.5459
22 300,000 0.3 0.001 1 0.4799 0.1429 0.2779 0.5050 0.4704 0.4960 0.4955
23 30,000 0.8 0.001 1 0.4990 0.1928 0.2653 0.5044 0.4802 0.4915 0.5176
24 300,000 0.8 0.001 1 0.4653 0.1098 0.3093 0.5043 0.4576 0.4790 0.5022

[Figure: Fraction of significant SNPs with improved association estimates due to method implementation, using a significance threshold of \(5 \times 10^{-8}\).]

Summary of results for mse contained in norm_5e-8_20sim.csv:

Scenario n_samples h2 prop_effect S EB FIQT BR cl1 cl2 cl3 rep
1 30,000 0.3 0.010 -1 -0.001739 -0.001747 -0.001756 -0.001440 -0.001631 -0.001592 -0.001604
2 300,000 0.3 0.010 -1 -0.000007 0.000001 -0.000004 0.000053 0.000023 0.000035 -0.000005
3 30,000 0.8 0.010 -1 -0.000435 -0.000453 -0.000464 0.000158 -0.000327 -0.000139 -0.000442
4 300,000 0.8 0.010 -1 -0.000002 0.000005 0.000000 0.000035 0.000020 0.000026 -0.000001
5 30,000 0.3 0.001 -1 -0.000032 0.000130 0.000073 0.000599 0.000245 0.000391 -0.000040
6 300,000 0.3 0.001 -1 0.000001 0.000006 0.000013 0.000014 0.000010 0.000012 0.000000
7 30,000 0.8 0.001 -1 -0.000001 0.000134 0.000163 0.000322 0.000192 0.000243 -0.000015
8 300,000 0.8 0.001 -1 0.000001 0.000003 0.000007 0.000008 0.000023 0.000011 0.000000
9 30,000 0.3 0.010 0 -0.000742 -0.000767 -0.000719 -0.000592 -0.000750 -0.000703 -0.000670
10 300,000 0.3 0.010 0 -0.000003 0.000000 -0.000002 0.000026 0.000011 0.000017 -0.000003
11 30,000 0.8 0.010 0 -0.000144 -0.000137 -0.000149 0.000184 -0.000065 0.000035 -0.000153
12 300,000 0.8 0.010 0 -0.000001 0.000002 0.000000 0.000021 0.000012 0.000015 -0.000001
13 30,000 0.3 0.001 0 -0.000021 0.000073 0.000038 0.000292 0.000124 0.000193 -0.000028
14 300,000 0.3 0.001 0 0.000000 0.000005 0.000009 0.000014 0.000009 0.000011 0.000000
15 30,000 0.8 0.001 0 -0.000001 0.000069 0.000105 0.000232 0.000121 0.000166 -0.000012
16 300,000 0.8 0.001 0 0.000002 0.000004 0.000006 0.000011 0.000026 0.000014 0.000000
17 30,000 0.3 0.010 1 -0.000500 -0.000565 -0.000532 -0.000297 -0.000526 -0.000449 -0.000609
18 300,000 0.3 0.010 1 -0.000002 0.000002 0.000000 0.000022 0.000011 0.000015 -0.000002
19 30,000 0.8 0.010 1 -0.000113 -0.000083 -0.000102 0.000214 -0.000011 0.000080 -0.000125
20 300,000 0.8 0.010 1 -0.000001 0.000002 0.000000 0.000017 0.000009 0.000012 -0.000001
21 30,000 0.3 0.001 1 -0.000015 0.000072 0.000046 0.000246 0.000109 0.000165 -0.000026
22 300,000 0.3 0.001 1 0.000000 0.000004 0.000008 0.000010 0.000007 0.000008 0.000000
23 30,000 0.8 0.001 1 0.000003 0.000056 0.000103 0.000156 0.000089 0.000116 -0.000005
24 300,000 0.8 0.001 1 0.000001 0.000003 0.000005 0.000008 0.000044 0.000016 0.000000

[Figure: Change in average MSE over all significant SNPs due to method implementation, using a significance threshold of \(5 \times 10^{-8}\).]

Summary of results for rel_mse contained in norm_5e-8_20sim.csv:

Scenario n_samples h2 prop_effect S EB FIQT BR cl1 cl2 cl3 rep
1 30,000 0.3 0.010 -1 -0.8659 -0.8976 -0.8466 -0.6782 -0.8740 -0.8233 -0.8254
2 300,000 0.3 0.010 -1 -0.3143 0.0630 -0.2040 2.5177 1.0966 1.6754 -0.2480
3 30,000 0.8 0.010 -1 -0.7314 -0.7086 -0.7393 0.3578 -0.5004 -0.1665 -0.7096
4 300,000 0.8 0.010 -1 -0.0977 0.2689 -0.0143 2.0181 1.1567 1.5003 -0.0817
5 30,000 0.3 0.001 -1 -0.1375 0.7887 0.4699 3.3286 1.4118 2.2051 -0.1428
6 300,000 0.3 0.001 -1 0.0349 0.4142 0.8520 0.9163 0.6300 0.7364 -0.0251
7 30,000 0.8 0.001 -1 -0.0032 0.7892 0.9577 1.8979 1.1281 1.4273 -0.0683
8 300,000 0.8 0.001 -1 0.0638 0.2201 0.4378 0.5427 1.4651 0.6953 0.0091
9 30,000 0.3 0.010 0 -0.5376 -0.6701 -0.4098 0.1175 -0.6814 -0.3631 -0.6809
10 300,000 0.3 0.010 0 -0.2770 0.0366 -0.1718 2.2535 0.9473 1.4787 -0.2795
11 30,000 0.8 0.010 0 -0.6387 -0.5912 -0.6478 0.8678 -0.2670 0.1896 -0.6672
12 300,000 0.8 0.010 0 -0.1012 0.1955 -0.0384 1.9523 1.1193 1.4323 -0.1073
13 30,000 0.3 0.001 0 -0.1800 0.6662 0.3534 2.5659 1.1128 1.7091 -0.2209
14 300,000 0.3 0.001 0 0.0303 0.4008 0.7867 1.2410 0.7627 0.9444 -0.0137
15 30,000 0.8 0.001 0 -0.0099 0.6751 1.0223 2.2359 1.1727 1.6048 -0.1075
16 300,000 0.8 0.001 0 0.1443 0.3015 0.4888 0.8795 2.0278 1.0410 -0.0314
17 30,000 0.3 0.010 1 -0.6014 -0.6888 -0.6122 0.1929 -0.6644 -0.3321 -0.8068
18 300,000 0.3 0.010 1 -0.2023 0.2016 -0.0206 2.2877 1.1015 1.5804 -0.2169
19 30,000 0.8 0.010 1 -0.5828 -0.3981 -0.5037 1.2294 0.0019 0.4995 -0.6136
20 300,000 0.8 0.010 1 -0.0734 0.2387 0.0130 1.8655 1.0276 1.3627 -0.0955
21 30,000 0.3 0.001 1 -0.1503 0.7625 0.4876 2.5461 1.1361 1.7120 -0.2582
22 300,000 0.3 0.001 1 0.0506 0.4088 0.8656 1.1302 0.7106 0.8616 -0.0083
23 30,000 0.8 0.001 1 0.0373 0.6623 1.2072 1.8365 1.0397 1.3586 -0.0476
24 300,000 0.8 0.001 1 0.1362 0.2605 0.4774 0.8255 4.3116 1.5529 -0.0373


The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##         EB       FIQT         BR        cl1        cl2        cl3        rep 
## -0.2086333  0.1416583  0.1754542  1.4553375  0.8488625  0.9951000 -0.2618500

The methods are ranked according to the results above in ascending order:

##   EB FIQT   BR  cl1  cl2  cl3  rep 
##    2    3    4    7    5    6    1
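The ranking step is a simple ascending sort of the per-method means, so the most negative mean relative change in MSE receives rank 1. A Python sketch using the means printed above (the helper `rank_ascending` is a hypothetical name and does not handle ties):

```python
# Mean rel_mse per method at the 5e-8 threshold, copied from the output above.
mean_rel_mse = {
    "EB": -0.2086333,
    "FIQT": 0.1416583,
    "BR": 0.1754542,
    "cl1": 1.4553375,
    "cl2": 0.8488625,
    "cl3": 0.9951000,
    "rep": -0.2618500,
}


def rank_ascending(scores):
    """Rank 1 goes to the smallest (most negative) score."""
    ordered = sorted(scores, key=scores.get)
    return {name: ordered.index(name) + 1 for name in scores}


ranks = rank_ascending(mean_rel_mse)
print(ranks)
```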

[Figure: Relative change in average MSE over all significant SNPs due to method implementation, using a significance threshold of \(5 \times 10^{-8}\).]

3. Observations and discussion of above simulations

It is worth noting from the above simulations that the Winner’s Curse methods tend to break down, that is, they no longer improve on the naive estimates according to the third evaluation metric, rel_mse, when the proportion of effect SNPs (prop_effect, our measure of polygenicity) is 0.001. That said, the empirical Bayes method performs very similarly to simply taking a replication estimate when the replication sample is the same size as the discovery GWAS, i.e. n_samples is equal in both data sets. As we will see later, at this threshold of 5e-8 empirical Bayes outperforms the replication method once the replication GWAS sample size becomes smaller than that of the discovery GWAS. This observation strongly supports the use of empirical Bayes as a Winner’s Curse adjustment method, particularly when a replication GWAS is not available.

For prop_effect = 0.01, our proposed bootstrap method for summary statistics performs in a comparable manner to both empirical Bayes and the replication method. However, the bootstrap method ceases to perform as well at obtaining less biased association estimates when the proportion of effect SNPs is reduced to 0.001.

In almost all scenarios, the conditional likelihood methods perform poorly compared to the other methods.

Although we also consider a significance threshold of \(5 \times 10^{-4}\), the genome-wide significance threshold of \(5 \times 10^{-8}\) is of most interest to us, since the less stringent threshold of \(5 \times 10^{-4}\) often results in the detection of many false positives.

4. Evaluating methods using a significance threshold of \(5 \times 10^{-4}\)

Similar to part 2 above, we use the code detailed in norm_5e-4_10sim.R with a total of 10 simulations to evaluate seven methods (the six Winner’s Curse methods above together with the replication benchmark) across each of the 24 scenarios. In the following investigations, we concentrate on the final bias evaluation metric, rel_mse.


Summary of results for rel_mse contained in norm_5e-4_10sim.csv:

Scenario n_samples h2 prop_effect S EB FIQT BR cl1 cl2 cl3 rep
1 30,000 0.3 0.010 -1 -0.9516 -0.9538 -0.8959 -0.8370 -0.7039 -0.7932 -0.9216
2 300,000 0.3 0.010 -1 -0.4013 -0.2231 -0.2830 -0.2135 -0.3549 -0.3199 -0.6819
3 30,000 0.8 0.010 -1 -0.7634 -0.7475 -0.7374 -0.6905 -0.6600 -0.7033 -0.8664
4 300,000 0.8 0.010 -1 -0.2859 -0.0602 -0.1729 -0.1374 -0.2304 -0.2131 -0.5530
5 30,000 0.3 0.001 -1 -0.8184 -0.7769 -0.7278 -0.7638 -0.6496 -0.7286 -0.8955
6 300,000 0.3 0.001 -1 -0.7582 -0.6891 -0.5591 -0.7297 -0.6083 -0.6904 -0.8329
7 30,000 0.8 0.001 -1 -0.7888 -0.7142 -0.6462 -0.7525 -0.6325 -0.7150 -0.8723
8 300,000 0.8 0.001 -1 -0.7390 -0.7006 -0.5663 -0.7322 -0.4960 -0.6632 -0.8239
9 30,000 0.3 0.010 0 -0.9602 -0.9623 -0.8920 -0.8364 -0.6981 -0.7889 -0.9228
10 300,000 0.3 0.010 0 -0.5172 -0.4540 -0.4662 -0.4573 -0.5001 -0.5093 -0.7680
11 30,000 0.8 0.010 0 -0.8003 -0.8057 -0.7681 -0.7498 -0.6698 -0.7350 -0.8929
12 300,000 0.8 0.010 0 -0.3762 -0.2532 -0.3139 -0.2712 -0.3518 -0.3419 -0.6549
13 30,000 0.3 0.001 0 -0.8694 -0.8520 -0.7850 -0.8045 -0.6715 -0.7598 -0.9124
14 300,000 0.3 0.001 0 -0.7661 -0.7077 -0.5836 -0.7417 -0.6210 -0.7032 -0.8567
15 30,000 0.8 0.001 0 -0.8080 -0.7669 -0.6897 -0.7596 -0.6432 -0.7231 -0.8884
16 300,000 0.8 0.001 0 -0.7355 -0.7030 -0.5716 -0.7424 -0.4546 -0.6598 -0.8420
17 30,000 0.3 0.010 1 -0.9483 -0.9552 -0.8843 -0.8346 -0.6971 -0.7874 -0.9221
18 300,000 0.3 0.010 1 -0.5410 -0.4831 -0.4843 -0.5187 -0.5206 -0.5474 -0.7940
19 30,000 0.8 0.010 1 -0.7916 -0.7933 -0.7542 -0.7493 -0.6628 -0.7304 -0.8932
20 300,000 0.8 0.010 1 -0.4445 -0.3419 -0.3732 -0.4101 -0.4267 -0.4454 -0.7067
21 30,000 0.3 0.001 1 -0.8894 -0.8731 -0.8004 -0.8196 -0.6786 -0.7710 -0.9139
22 300,000 0.3 0.001 1 -0.8113 -0.7658 -0.6339 -0.7680 -0.6377 -0.7245 -0.8767
23 30,000 0.8 0.001 1 -0.8456 -0.8061 -0.7222 -0.7801 -0.6548 -0.7385 -0.8952
24 300,000 0.8 0.001 1 -0.7679 -0.7470 -0.6121 -0.7711 -0.4363 -0.6760 -0.8655


The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##         EB       FIQT         BR        cl1        cl2        cl3        rep 
## -0.7241292 -0.6723208 -0.6218042 -0.6612917 -0.5691792 -0.6445125 -0.8355375

The methods are ranked according to the results above in ascending order:

##   EB FIQT   BR  cl1  cl2  cl3  rep 
##    2    3    6    4    7    5    1

[Figure: Relative change in average MSE over all significant SNPs due to method implementation, using a significance threshold of \(5 \times 10^{-4}\).]

5. Bimodal distribution of effect sizes

Here we investigate the 24 different scenarios under a bimodal distribution of effect sizes. In order to create a bimodal distribution, we simulate 50% of effect sizes of the true effect SNPs from a normal distribution centered at 0 while the other half are generated from a normal distribution with mean 2.5. As above, we first have a look at the expected number of significant SNPs and the expected proportion of those in which their association estimate is significantly exaggerated.
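As an illustration of the mixture described above, the following Python sketch draws half of the effect sizes from a normal distribution centred at 0 and the other half from one with mean 2.5. The common standard deviation of 1, the seed and the function name are assumptions made for this sketch only. The text does not state the spread of the two components, and any subsequent rescaling of effects to match h2 is not shown.

```python
import random
from statistics import mean


def simulate_bimodal_effects(n_effect, mean_second=2.5, sd=1.0, seed=1):
    """Draw n_effect effect sizes, half from N(0, sd) and half from N(mean_second, sd)."""
    rng = random.Random(seed)
    n_first = n_effect // 2
    effects = [rng.gauss(0.0, sd) for _ in range(n_first)]
    effects += [rng.gauss(mean_second, sd) for _ in range(n_effect - n_first)]
    # Shuffle so that component membership is not tied to SNP order.
    rng.shuffle(effects)
    return effects


effects = simulate_bimodal_effects(10_000)
print(len(effects), mean(effects))
```

With equal-sized components, the overall mean of the simulated effects is close to 1.25, halfway between the two modes.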

Running the code provided in nsig_prop_bias_100sim.R, we obtain the following results:

Scenario n_samples h2 prop_effect S n_sig_5e-8 prop_bias_5e-8 prop_x_5e-8 mse_5e-8 sd(n_sig)_5e-8 sd(prop_bias)_5e-8 sd(prop_x)_5e-8 sd(mse)_5e-8
1 30,000 0.3 0.010 -1 0.61 1.0000 0.9583 0.001874 0.803 0.0000 0.1729 0.003349
2 300,000 0.3 0.010 -1 855.06 0.7741 0.0910 0.000016 19.640 0.0127 0.0090 0.000001
3 30,000 0.8 0.010 -1 24.74 0.9968 0.4860 0.000474 5.181 0.0111 0.1190 0.000207
4 300,000 0.8 0.010 -1 2820.65 0.6243 0.0470 0.000014 27.505 0.0085 0.0043 0.000001
5 30,000 0.3 0.001 -1 85.88 0.7709 0.0882 0.000161 5.689 0.0503 0.0294 0.000041
6 300,000 0.3 0.001 -1 574.45 0.5513 0.0337 0.000015 12.756 0.0207 0.0080 0.000002
7 30,000 0.8 0.001 -1 281.92 0.6252 0.0475 0.000136 8.026 0.0287 0.0118 0.000022
8 300,000 0.8 0.001 -1 729.09 0.5264 0.0288 0.000015 12.631 0.0177 0.0056 0.000001
9 30,000 0.3 0.010 0 0.44 1.0000 1.0000 0.002490 0.686 0.0000 0.0000 0.006906
10 300,000 0.3 0.010 0 906.91 0.7750 0.0855 0.000012 18.945 0.0132 0.0094 0.000001
11 30,000 0.8 0.010 0 22.50 1.0000 0.5724 0.000392 4.520 0.0000 0.1058 0.000087
12 300,000 0.8 0.010 0 2867.23 0.6130 0.0442 0.000010 29.528 0.0090 0.0038 0.000000
13 30,000 0.3 0.001 0 91.79 0.7793 0.0867 0.000120 6.703 0.0456 0.0286 0.000026
14 300,000 0.3 0.001 0 543.36 0.5426 0.0315 0.000012 10.686 0.0230 0.0083 0.000001
15 30,000 0.8 0.001 0 285.41 0.6093 0.0420 0.000101 9.313 0.0282 0.0111 0.000013
16 300,000 0.8 0.001 0 693.12 0.5289 0.0307 0.000013 12.186 0.0176 0.0075 0.000001
17 30,000 0.3 0.010 1 0.52 1.0000 1.0000 0.001402 0.717 0.0000 0.0000 0.001109
18 300,000 0.3 0.010 1 918.79 0.8193 0.0945 0.000012 19.846 0.0133 0.0096 0.000001
19 30,000 0.8 0.010 1 15.89 1.0000 0.7521 0.000474 3.408 0.0000 0.1188 0.000097
20 300,000 0.8 0.010 1 3130.86 0.6014 0.0392 0.000010 26.113 0.0098 0.0034 0.000000
21 30,000 0.3 0.001 1 91.90 0.8221 0.0879 0.000119 5.901 0.0404 0.0293 0.000023
22 300,000 0.3 0.001 1 521.36 0.5350 0.0309 0.000012 8.481 0.0212 0.0081 0.000001
23 30,000 0.8 0.001 1 313.57 0.5995 0.0392 0.000095 8.843 0.0282 0.0123 0.000010
24 300,000 0.8 0.001 1 628.44 0.5231 0.0298 0.000013 10.003 0.0184 0.0065 0.000001


Next, we repeat the process illustrated in Section 2 using the third bias evaluation metric, rel_mse, with a significance threshold of \(5 \times 10^{-8}\).

Summary of results for rel_mse contained in bimod_5e-8_10sim.csv:

Scenario n_samples h2 prop_effect S EB FIQT BR cl1 cl2 cl3 rep
1 30,000 0.3 0.010 -1 -0.8652 -0.8611 -0.8801 -0.6651 -0.7987 -0.7563 -0.8911
2 300,000 0.3 0.010 -1 -0.4005 -0.0961 -0.3535 2.3903 0.9330 1.5280 -0.3260
3 30,000 0.8 0.010 -1 -0.8398 -0.7968 -0.8554 0.4373 -0.6085 -0.2000 -0.7343
4 300,000 0.8 0.010 -1 -0.1200 0.2525 -0.0535 2.2781 1.2585 1.6702 -0.0908
5 30,000 0.3 0.001 -1 -0.1251 0.4392 0.0816 2.4621 0.9733 1.5853 -0.3879
6 300,000 0.3 0.001 -1 0.0368 0.4734 0.7842 1.2027 0.7414 0.9233 0.0476
7 30,000 0.8 0.001 -1 -0.0272 0.9478 0.8110 2.5743 1.4722 1.9121 -0.0455
8 300,000 0.8 0.001 -1 0.0758 0.2279 0.5819 0.6036 0.7481 0.5433 0.0105
9 30,000 0.3 0.010 0 -0.8787 -0.9023 -0.8513 -0.5169 -0.9215 -0.7961 -0.8889
10 300,000 0.3 0.010 0 -0.4038 -0.0690 -0.3713 2.4564 1.0098 1.5992 -0.3064
11 30,000 0.8 0.010 0 -0.8078 -0.8759 -0.8852 0.0434 -0.7177 -0.4219 -0.7881
12 300,000 0.8 0.010 0 -0.1118 0.2409 -0.1032 2.0764 1.1539 1.5229 -0.1016
13 30,000 0.3 0.001 0 -0.3081 0.6181 0.0185 3.1434 1.3302 2.0801 -0.2942
14 300,000 0.3 0.001 0 0.0274 0.3868 0.5348 1.1949 0.7106 0.9010 0.0196
15 30,000 0.8 0.001 0 -0.0590 0.6341 0.4830 1.8248 0.9890 1.3211 -0.1607
16 300,000 0.8 0.001 0 0.0406 0.3382 0.6162 1.1115 0.9371 0.9084 0.0302
17 30,000 0.3 0.010 1 -0.9393 -0.9454 -0.9415 -0.7304 -0.9411 -0.9054 -0.8765
18 300,000 0.3 0.010 1 -0.4866 -0.2098 -0.4695 2.7316 0.9980 1.7135 -0.3306
19 30,000 0.8 0.010 1 -0.8511 -0.9174 -0.9161 -0.2153 -0.7834 -0.5711 -0.8449
20 300,000 0.8 0.010 1 -0.1137 0.3166 -0.1154 2.1944 1.2935 1.6506 -0.0561
21 30,000 0.3 0.001 1 -0.5084 0.0929 -0.3487 2.3442 0.7355 1.3987 -0.3590
22 300,000 0.3 0.001 1 0.0575 0.3490 0.5646 1.2555 0.9525 0.9916 -0.0665
23 30,000 0.8 0.001 1 0.0243 0.8020 0.4923 2.3288 1.3454 1.7375 -0.0716
24 300,000 0.8 0.001 1 0.0481 0.2526 0.4966 0.6978 0.5453 0.5735 0.0543


The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##          EB        FIQT          BR         cl1         cl2         cl3 
## -0.31398333  0.02909167 -0.07000000  1.38432500  0.55651667  0.87122917 
##         rep 
## -0.31077083

The methods are ranked according to the results above in ascending order:

##   EB FIQT   BR  cl1  cl2  cl3  rep 
##    1    4    3    7    5    6    2

[Figure: Relative change in average MSE over all significant SNPs due to method implementation, using a significance threshold of \(5 \times 10^{-8}\) and a bimodal distribution of effect sizes.]

6. Evaluating methods which use both replication and discovery summary statistics

In this section, we compare Winner’s Curse adjustment methods which use summary statistics from both the discovery and a replication GWAS. We first consider the situation in which the replication and discovery GWASs are of the same size. As an interesting comparison, we also include the empirical Bayes method, which uses information from the discovery GWAS only. The significance threshold is set to 5e-8 and, in this instance, the effect size distribution is normal.

Summary of results for rel_mse contained in replicate_norm_5e-8_10sim.csv:

Scenario n_samples h2 prop_effect S EB rep UMVCUE cl1_com cl2_MLE cl3_MSE MSE_min MSE_min_sp EB_com
1 30,000 0.3 0.010 -1 -0.6882 -0.9846 -0.7103 -0.7103 -0.9982 -0.9833 -0.9470 -0.9596 -0.9835
2 300,000 0.3 0.010 -1 -0.3039 -0.2882 -0.2875 -0.5715 -0.5010 -0.5468 -0.4586 -0.5787 -0.6396
3 30,000 0.8 0.010 -1 -0.6578 -0.7883 -0.7925 -0.6863 -0.8269 -0.8061 -0.7867 -0.7849 -0.8828
4 300,000 0.8 0.010 -1 -0.0822 -0.0425 -0.0432 -0.5127 -0.4389 -0.4863 -0.2933 -0.5127 -0.5345
5 30,000 0.3 0.001 -1 -0.1343 -0.2324 -0.2379 -0.5933 -0.4923 -0.5561 -0.4101 -0.5986 -0.5979
6 300,000 0.3 0.001 -1 0.0362 0.0439 0.0419 -0.4850 -0.4431 -0.4703 -0.2265 -0.4795 -0.4657
7 30,000 0.8 0.001 -1 0.0393 -0.0825 -0.0868 -0.5366 -0.4908 -0.5311 -0.3227 -0.5310 -0.5391
8 300,000 0.8 0.001 -1 0.0787 -0.0052 -0.0096 -0.4909 -0.4644 -0.4833 -0.2626 -0.4887 -0.4678
9 30,000 0.3 0.010 0 -0.6875 -0.6995 -0.6209 -0.5615 -0.8109 -0.7351 -0.6391 -0.6618 -0.8954
10 300,000 0.3 0.010 0 -0.2877 -0.3129 -0.3139 -0.5805 -0.5447 -0.5762 -0.4770 -0.5909 -0.6496
11 30,000 0.8 0.010 0 -0.6383 -0.7218 -0.7248 -0.6759 -0.7999 -0.7777 -0.7399 -0.7650 -0.8483
12 300,000 0.8 0.010 0 -0.0966 -0.1277 -0.1278 -0.5285 -0.4651 -0.5063 -0.3467 -0.5294 -0.5521
13 30,000 0.3 0.001 0 -0.1844 -0.2980 -0.3077 -0.5853 -0.5460 -0.5770 -0.4712 -0.5854 -0.6108
14 300,000 0.3 0.001 0 0.0482 0.0126 0.0126 -0.5011 -0.4424 -0.4783 -0.2647 -0.4946 -0.4807
15 30,000 0.8 0.001 0 0.0058 -0.0662 -0.0721 -0.5415 -0.4477 -0.5004 -0.3086 -0.5315 -0.5255
16 300,000 0.8 0.001 0 0.1245 -0.0364 -0.0358 -0.5072 -0.4660 -0.4906 -0.2879 -0.5012 -0.4653
17 30,000 0.3 0.010 1 -0.6981 -0.8833 -0.8371 -0.6669 -0.9011 -0.8732 -0.8594 -0.8546 -0.8706
18 300,000 0.3 0.010 1 -0.2217 -0.2487 -0.2494 -0.5630 -0.5114 -0.5477 -0.4266 -0.5703 -0.6116
19 30,000 0.8 0.010 1 -0.5748 -0.6226 -0.6211 -0.6454 -0.7037 -0.6968 -0.6611 -0.7108 -0.7825
20 300,000 0.8 0.010 1 -0.0721 -0.0740 -0.0743 -0.5209 -0.4465 -0.4912 -0.3145 -0.5211 -0.5344
21 30,000 0.3 0.001 1 -0.1841 -0.2900 -0.2943 -0.5556 -0.5136 -0.5407 -0.4437 -0.5579 -0.5866
22 300,000 0.3 0.001 1 0.0408 -0.0108 -0.0172 -0.4975 -0.4613 -0.4878 -0.2637 -0.4883 -0.4829
23 30,000 0.8 0.001 1 0.0218 -0.1438 -0.1439 -0.5176 -0.4641 -0.5037 -0.3492 -0.5119 -0.5172
24 300,000 0.8 0.001 1 0.1391 -0.0311 -0.0334 -0.5134 -0.4676 -0.4952 -0.2823 -0.4984 -0.4439


The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
## -0.2073875 -0.2889167 -0.2744583 -0.5645167 -0.5686500 -0.5892167 -0.4517958 
## MSE_min_sp     EB_com 
## -0.5961167 -0.6236792

The methods are ranked according to the results above in ascending order:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
##          9          7          8          5          4          3          6 
## MSE_min_sp     EB_com 
##          2          1

[Figure: Relative change in average MSE over all significant SNPs due to method implementation with a replication and discovery GWAS of equal size, using a significance threshold of \(5 \times 10^{-8}\).]

We also look at how the methods compare when the replication data set is 50% of the size of the discovery data set.
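One way to anticipate the effect of a smaller replication set is through the standard error. Under the usual approximation that the standard error of a GWAS effect estimate scales with \(1/\sqrt{n}\) (an assumption of this sketch, not something stated in the simulation code), halving the replication sample inflates the standard error of the replication estimate by \(\sqrt{2}\) and doubles its variance. The helper name below is hypothetical.

```python
from math import sqrt


def replication_se_inflation(n_discovery, n_replication):
    """Ratio of replication to discovery standard error, assuming se ~ 1/sqrt(n)."""
    return sqrt(n_discovery / n_replication)


# Replication set half the size of a 30,000-sample discovery GWAS.
ratio = replication_se_inflation(30_000, 15_000)
print(ratio, ratio ** 2)
```

A doubling of variance for the replication estimate is in keeping with the weaker performance of the rep column in the half-size results compared with the equal-size results.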

Summary of results for rel_mse contained in replicate_norm_5e-8_10sim_halfrep.csv:

Scenario n_samples h2 prop_effect S EB rep UMVCUE cl1_com cl2_MLE cl3_MSE MSE_min MSE_min_sp EB_com
1 30,000 0.3 0.010 -1 -0.6882 -0.9692 -0.5070 -0.5070 -0.9972 -0.9898 -0.8918 -0.9156 -0.9385
2 300,000 0.3 0.010 -1 -0.3039 0.4235 0.4255 -0.3967 -0.1935 -0.2784 -0.0177 -0.3682 -0.5212
3 30,000 0.8 0.010 -1 -0.6578 -0.5765 -0.5905 -0.4949 -0.7084 -0.6838 -0.6223 -0.6608 -0.8417
4 300,000 0.8 0.010 -1 -0.0822 0.9149 0.9130 -0.3454 -0.1622 -0.2502 0.3008 -0.2630 -0.3842
5 30,000 0.3 0.001 -1 -0.1343 0.5352 0.5164 -0.4292 -0.1783 -0.2926 0.0679 -0.3806 -0.4894
6 300,000 0.3 0.001 -1 0.0362 1.0879 1.0790 -0.3185 -0.2162 -0.2636 0.4280 -0.2055 -0.3000
7 30,000 0.8 0.001 -1 0.0393 0.8351 0.8211 -0.3718 -0.2460 -0.3236 0.2452 -0.2839 -0.3534
8 300,000 0.8 0.001 -1 0.0787 0.9897 0.9751 -0.3215 -0.2673 -0.2974 0.3496 -0.2250 -0.2294
9 30,000 0.3 0.010 0 -0.6875 -0.3990 -0.4354 -0.3463 -0.7242 -0.6248 -0.4229 -0.4559 -0.9108
10 300,000 0.3 0.010 0 -0.2877 0.3742 0.3716 -0.4058 -0.2820 -0.3437 -0.0460 -0.3863 -0.5259
11 30,000 0.8 0.010 0 -0.6383 -0.4437 -0.4521 -0.4880 -0.6691 -0.6445 -0.5414 -0.6195 -0.7761
12 300,000 0.8 0.010 0 -0.0966 0.7446 0.7442 -0.3573 -0.1895 -0.2678 0.1942 -0.2984 -0.4015
13 30,000 0.3 0.001 0 -0.1844 0.4040 0.3789 -0.4131 -0.2849 -0.3460 -0.0332 -0.3750 -0.5032
14 300,000 0.3 0.001 0 0.0482 1.0252 1.0220 -0.3358 -0.1968 -0.2597 0.3561 -0.2248 -0.3109
15 30,000 0.8 0.001 0 0.0058 0.8677 0.8524 -0.3794 -0.1703 -0.2612 0.2766 -0.2789 -0.3435
16 300,000 0.8 0.001 0 0.1245 0.9271 0.9237 -0.3391 -0.2548 -0.2960 0.3060 -0.2387 -0.1710
17 30,000 0.3 0.010 1 -0.6981 -0.7666 -0.6939 -0.4618 -0.8253 -0.7902 -0.7456 -0.7447 -0.8235
18 300,000 0.3 0.010 1 -0.2217 0.5025 0.5006 -0.3896 -0.2382 -0.3085 0.0459 -0.3534 -0.4820
19 30,000 0.8 0.010 1 -0.5748 -0.2451 -0.2437 -0.4588 -0.5092 -0.5084 -0.3922 -0.5315 -0.7202
20 300,000 0.8 0.010 1 -0.0721 0.8520 0.8509 -0.3528 -0.1724 -0.2536 0.2598 -0.2785 -0.3835
21 30,000 0.3 0.001 1 -0.1841 0.4200 0.4012 -0.3765 -0.2425 -0.2943 0.0090 -0.3219 -0.4077
22 300,000 0.3 0.001 1 0.0408 0.9783 0.9577 -0.3292 -0.2389 -0.2900 0.3544 -0.2171 -0.2942
23 30,000 0.8 0.001 1 0.0218 0.7124 0.7085 -0.3422 -0.1952 -0.2712 0.1882 -0.2674 -0.3421
24 300,000 0.8 0.001 1 0.1391 0.9378 0.9269 -0.3473 -0.2514 -0.2974 0.3210 -0.2290 -0.2233

\(~\) \(~\) \(~\) \(~\)

The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##            EB           rep        UMVCUE       cl1_com       cl2_MLE 
## -0.2073875000  0.4221666667  0.4352541667 -0.3878333333 -0.3505750000 
##       cl3_MSE       MSE_min    MSE_min_sp        EB_com 
## -0.3931958333 -0.0004333333 -0.3801500000 -0.4865500000

The methods are ranked according to the results above in ascending order:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
##          6          8          9          3          5          2          7 
## MSE_min_sp     EB_com 
##          4          1

\(~\) \(~\) \(~\) \(~\)

Relative change in average MSE over all significant SNPs due to method implementation with a replication GWAS 50% of the size of the discovery GWAS, using a significance threshold of \(5 \times 10^{-8}\):

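Interestingly, the empirical Bayes method achieves a lower relative MSE than UMVCUE, the replication method and the original MSE minimization method in nearly all of the scenarios. The only exceptions are Scenarios 1 and 17, in which the replication method and the original MSE minimization method do better.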

\(~\) \(~\) \(~\) \(~\)

Next, we look at how the methods compare when the replication data set is 10% of the size of the discovery data set.

Summary of results for rel_mse contained in replicate_norm_5e-8_10sim_10pc.csv:

Scenario n_samples h2 prop_effect S EB rep UMVCUE cl1_com cl2_MLE cl3_MSE MSE_min MSE_min_sp EB_com
1 30,000 0.3 0.010 -1 -0.6882 -0.8459 -0.1347 -0.1347 -0.8650 -0.9098 -0.7054 -0.7148 -0.7130
2 300,000 0.3 0.010 -1 -0.3039 6.1179 6.1293 -0.1142 0.9608 0.7682 3.2779 1.1001 -0.3450
3 30,000 0.8 0.010 -1 -0.6578 1.1185 0.9389 -0.1447 -0.2970 -0.2879 0.3532 -0.0834 -0.7412
4 300,000 0.8 0.010 -1 -0.0822 8.5752 8.5514 -0.0966 0.7102 0.5184 4.8495 1.6656 -0.1624
5 30,000 0.3 0.001 -1 -0.1343 6.6803 6.3721 -0.1469 1.0660 0.8077 3.6450 1.2937 -0.2703
6 300,000 0.3 0.001 -1 0.0362 9.4399 9.2797 -0.0831 0.3385 0.2465 5.4308 1.9488 -0.0391
7 30,000 0.8 0.001 -1 0.0393 8.1802 8.0087 -0.1144 0.5484 0.3722 4.5871 1.5671 -0.1233
8 300,000 0.8 0.001 -1 0.0787 8.9488 8.7408 -0.0805 0.1317 0.0725 5.0293 1.8260 -0.0071
9 30,000 0.3 0.010 0 -0.6875 2.0066 0.1214 -0.0302 -0.6321 -0.5416 1.0462 0.5164 -0.7038
10 300,000 0.3 0.010 0 -0.2877 5.8715 5.8446 -0.1192 0.6401 0.4860 3.1860 1.0214 -0.3552
11 30,000 0.8 0.010 0 -0.6383 1.7832 1.6884 -0.1457 -0.1774 -0.1826 0.7831 0.1045 -0.6829
12 300,000 0.8 0.010 0 -0.0966 7.7234 7.7151 -0.0978 0.6801 0.5023 4.2934 1.4651 -0.1789
13 30,000 0.3 0.001 0 -0.1844 6.0239 5.8151 -0.1267 0.6105 0.4558 3.2929 1.1285 -0.3026
14 300,000 0.3 0.001 0 0.0482 9.1264 9.0215 -0.0943 0.4762 0.3493 5.1177 1.8393 -0.0418
15 30,000 0.8 0.001 0 0.0058 8.3434 8.1709 -0.1224 0.6860 0.5075 4.7389 1.6262 -0.1068
16 300,000 0.8 0.001 0 0.1245 8.6361 8.4906 -0.0920 0.2454 0.1596 4.8462 1.7429 0.0438
17 30,000 0.3 0.010 1 -0.6981 0.1675 0.4949 -0.1074 -0.6723 -0.6461 -0.0359 -0.1945 -0.7822
18 300,000 0.3 0.010 1 -0.2217 6.5131 6.4877 -0.1123 0.6752 0.5065 3.6019 1.1830 -0.2980
19 30,000 0.8 0.010 1 -0.5748 2.7767 2.7441 -0.1316 0.1551 0.1108 1.4722 0.4045 -0.6227
20 300,000 0.8 0.010 1 -0.0721 8.2604 8.2430 -0.0995 0.6564 0.4842 4.6253 1.5888 -0.1579
21 30,000 0.3 0.001 1 -0.1841 6.1040 5.7746 -0.0971 0.6218 0.4984 3.3648 1.3257 -0.1504
22 300,000 0.3 0.001 1 0.0408 8.8921 8.6013 -0.0864 0.3156 0.1999 5.0534 1.8168 -0.0489
23 30,000 0.8 0.001 1 0.0218 7.5668 7.4343 -0.0835 0.6294 0.4626 4.2365 1.6010 -0.0570
24 300,000 0.8 0.001 1 0.1391 8.6897 8.4694 -0.0993 0.2762 0.1855 4.9102 1.7712 0.0610

\(~\) \(~\) \(~\) \(~\)

The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
## -0.2073875  6.1124875  5.9584625 -0.1066875  0.3241583  0.2135792  3.3750083 
## MSE_min_sp     EB_com 
##  1.1476625 -0.2827375

The methods are ranked according to the results above in ascending order:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
##          2          9          8          3          5          4          7 
## MSE_min_sp     EB_com 
##          6          1

\(~\) \(~\) \(~\) \(~\)

Relative change in average MSE over all significant SNPs due to method implementation with a replication GWAS 10% of the size of the discovery GWAS, using a significance threshold of \(5 \times 10^{-8}\):

In this situation, the empirical Bayes method tends to perform better than almost all of the other methods; on average, only EB_com achieves a lower relative MSE.
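
The sharp deterioration of the replication-based estimators can be checked against the means reported so far in this section. The sketch below assumes that rel_mse is the MSE of the adjusted estimates divided by the MSE of the naive estimates, minus one; this definition is an assumption made here for illustration. Since the variance of an unbiased replication estimate scales inversely with the replication sample size, the product of (1 + rel_mse) and the ratio of replication to discovery sample size should then be roughly constant for `rep`. All object names are hypothetical.

```r
# Mean relative change in MSE for the replication estimator `rep`, copied from
# the three summaries above; names and layout here are illustrative
rep_rel_mse <- c(equal = -0.2889167, half = 0.4221666667, tenth = 6.1124875)

# Replication sample size as a fraction of the discovery sample size
size_ratio <- c(equal = 1, half = 0.5, tenth = 0.1)

# ASSUMPTION: rel_mse = MSE(adjusted) / MSE(naive) - 1, so 1 + rel_mse is the
# ratio of the replication estimator's MSE to the naive estimator's MSE
mse_ratio <- 1 + rep_rel_mse

# If MSE(rep) is proportional to 1 / n_replication, this product is constant
scaled <- mse_ratio * size_ratio
print(round(scaled, 3))
```

All three products are close to 0.711, which suggests that the increase in MSE for `rep` is driven by the larger sampling variance of a smaller replication GWAS. The EB column, by contrast, takes the same values in each of these tables.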

\(~\) \(~\) \(~\) \(~\)

Following this, we look at how the methods compare when the replication data set is twice the size of the discovery data set.

Summary of results for rel_mse contained in replicate_norm_5e-8_10sim_2.csv:

Scenario n_samples h2 prop_effect S EB rep UMVCUE cl1_com cl2_MLE cl3_MSE MSE_min MSE_min_sp EB_com
1 30,000 0.3 0.010 -1 -0.6882 -0.9923 -0.8629 -0.8629 -0.9979 -0.9828 -0.9774 -0.9822 -0.9935
2 300,000 0.3 0.010 -1 -0.3039 -0.6441 -0.6438 -0.7304 -0.7095 -0.7295 -0.7015 -0.7108 -0.7604
3 30,000 0.8 0.010 -1 -0.6578 -0.8941 -0.8957 -0.8350 -0.9041 -0.8895 -0.8862 -0.8736 -0.9258
4 300,000 0.8 0.010 -1 -0.0822 -0.5213 -0.5215 -0.6774 -0.6518 -0.6725 -0.6112 -0.6494 -0.6866
5 30,000 0.3 0.001 -1 -0.1343 -0.6162 -0.6178 -0.7405 -0.7020 -0.7322 -0.6741 -0.7284 -0.7285
6 300,000 0.3 0.001 -1 0.0362 -0.4780 -0.4783 -0.6544 -0.6388 -0.6517 -0.5765 -0.6273 -0.6429
7 30,000 0.8 0.001 -1 0.0393 -0.5413 -0.5425 -0.6951 -0.6808 -0.6972 -0.6268 -0.6644 -0.6942
8 300,000 0.8 0.001 -1 0.0787 -0.5026 -0.5039 -0.6613 -0.6401 -0.6537 -0.5924 -0.6270 -0.6337
9 30,000 0.3 0.010 0 -0.6875 -0.8498 -0.7796 -0.7509 -0.8855 -0.8361 -0.8018 -0.8150 -0.9176
10 300,000 0.3 0.010 0 -0.2877 -0.6564 -0.6568 -0.7377 -0.7301 -0.7429 -0.7116 -0.7238 -0.7674
11 30,000 0.8 0.010 0 -0.6383 -0.8609 -0.8620 -0.8242 -0.8862 -0.8707 -0.8582 -0.8666 -0.9065
12 300,000 0.8 0.010 0 -0.0966 -0.5639 -0.5639 -0.6929 -0.6728 -0.6901 -0.6397 -0.6592 -0.7026
13 30,000 0.3 0.001 0 -0.1844 -0.6490 -0.6526 -0.7397 -0.7295 -0.7422 -0.7074 -0.7130 -0.7541
14 300,000 0.3 0.001 0 0.0482 -0.4937 -0.4933 -0.6665 -0.6437 -0.6608 -0.5964 -0.6374 -0.6466
15 30,000 0.8 0.001 0 0.0058 -0.5331 -0.5354 -0.6971 -0.6593 -0.6849 -0.6231 -0.6689 -0.6918
16 300,000 0.8 0.001 0 0.1245 -0.5182 -0.5174 -0.6736 -0.6495 -0.6636 -0.6068 -0.6400 -0.6349
17 30,000 0.3 0.010 1 -0.6981 -0.9417 -0.9185 -0.8290 -0.9463 -0.9266 -0.9253 -0.9208 -0.9243
18 300,000 0.3 0.010 1 -0.2217 -0.6244 -0.6246 -0.7225 -0.7079 -0.7230 -0.6841 -0.7075 -0.7447
19 30,000 0.8 0.010 1 -0.5748 -0.8113 -0.8104 -0.7986 -0.8331 -0.8262 -0.8150 -0.8315 -0.8627
20 300,000 0.8 0.010 1 -0.0721 -0.5370 -0.5371 -0.6846 -0.6581 -0.6782 -0.6231 -0.6556 -0.6901
21 30,000 0.3 0.001 1 -0.1841 -0.6450 -0.6457 -0.7205 -0.7110 -0.7220 -0.6916 -0.7046 -0.7340
22 300,000 0.3 0.001 1 0.0408 -0.5054 -0.5074 -0.6658 -0.6537 -0.6644 -0.5972 -0.6386 -0.6339
23 30,000 0.8 0.001 1 0.0218 -0.5719 -0.5715 -0.6871 -0.6713 -0.6876 -0.6415 -0.6541 -0.6870
24 300,000 0.8 0.001 1 0.1391 -0.5155 -0.5158 -0.6771 -0.6468 -0.6640 -0.6065 -0.6454 -0.6055

\(~\) \(~\) \(~\) \(~\)

The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
## -0.2073875 -0.6444625 -0.6357667 -0.7260333 -0.7337417 -0.7413500 -0.6989750 
## MSE_min_sp     EB_com 
## -0.7227125 -0.7487208

The methods are ranked according to the results above in ascending order:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
##          9          7          8          4          3          2          6 
## MSE_min_sp     EB_com 
##          5          1

\(~\) \(~\) \(~\) \(~\)

Relative change in average MSE over all significant SNPs due to method implementation with a replication GWAS twice the size of the discovery GWAS, using a significance threshold of \(5 \times 10^{-8}\):

\(~\) \(~\) \(~\) \(~\)

As a final step in this section, we investigate the behaviour of methods when the replication and discovery GWASs are of equal size and the effect sizes follow the previously described bimodal distribution.

Summary of results for rel_mse contained in replicate_bim_5e-8_10sim.csv:

Scenario n_samples h2 prop_effect S EB rep UMVCUE cl1_com cl2_MLE cl3_MSE MSE_min MSE_min_sp EB_com
1 30,000 0.3 0.010 -1 -0.8882 -0.8792 -0.6408 -0.6408 -0.9209 -0.8815 -0.8318 -0.8582 -0.9102
2 300,000 0.3 0.010 -1 -0.3975 -0.3761 -0.3765 -0.5983 -0.5435 -0.5804 -0.5153 -0.6131 -0.6815
3 30,000 0.8 0.010 -1 -0.7891 -0.7858 -0.7865 -0.6862 -0.8233 -0.8046 -0.7840 -0.8014 -0.9024
4 300,000 0.8 0.010 -1 -0.1208 -0.0939 -0.0939 -0.5282 -0.4397 -0.4911 -0.3279 -0.5289 -0.5564
5 30,000 0.3 0.001 -1 -0.1081 -0.3742 -0.3740 -0.6430 -0.5587 -0.6026 -0.5154 -0.6541 -0.6748
6 300,000 0.3 0.001 -1 0.0300 0.0255 0.0225 -0.4892 -0.4435 -0.4747 -0.2444 -0.4875 -0.4730
7 30,000 0.8 0.001 -1 -0.0637 -0.1305 -0.1358 -0.5422 -0.4806 -0.5245 -0.3539 -0.5409 -0.5623
8 300,000 0.8 0.001 -1 0.0520 0.0253 0.0233 -0.4864 -0.4564 -0.4761 -0.2434 -0.4789 -0.4368
9 30,000 0.3 0.010 0 -0.9917 -0.9110 -0.8431 -0.7610 -0.9039 -0.9125 -0.9088 -0.9245 -0.9346
10 300,000 0.3 0.010 0 -0.3946 -0.3201 -0.3205 -0.5805 -0.5158 -0.5522 -0.4702 -0.5928 -0.6792
11 30,000 0.8 0.010 0 -0.7912 -0.8298 -0.8329 -0.6846 -0.8476 -0.8169 -0.8141 -0.8358 -0.8911
12 300,000 0.8 0.010 0 -0.1084 -0.0965 -0.0966 -0.5234 -0.4556 -0.5010 -0.3284 -0.5247 -0.5503
13 30,000 0.3 0.001 0 -0.3183 -0.2697 -0.2695 -0.5635 -0.4849 -0.5315 -0.4448 -0.5734 -0.5708
14 300,000 0.3 0.001 0 0.0285 0.0453 0.0453 -0.4857 -0.4372 -0.4652 -0.2293 -0.4827 -0.4832
15 30,000 0.8 0.001 0 -0.0257 -0.1480 -0.1512 -0.5564 -0.4796 -0.5307 -0.3657 -0.5568 -0.5448
16 300,000 0.8 0.001 0 0.0493 -0.0237 -0.0234 -0.5048 -0.4680 -0.4932 -0.2879 -0.5039 -0.4770
17 30,000 0.3 0.010 1 -0.9631 -0.9923 -0.7921 -0.7921 -1.0000 -0.9971 -0.9981 -0.9981 -0.9728
18 300,000 0.3 0.010 1 -0.4894 -0.3632 -0.3633 -0.5845 -0.5395 -0.5635 -0.4942 -0.6034 -0.7141
19 30,000 0.8 0.010 1 -0.8711 -0.8479 -0.8470 -0.7067 -0.8694 -0.8515 -0.8344 -0.8599 -0.9432
20 300,000 0.8 0.010 1 -0.1136 -0.0642 -0.0642 -0.5173 -0.4323 -0.4841 -0.3065 -0.5182 -0.5436
21 30,000 0.3 0.001 1 -0.3030 -0.4114 -0.4096 -0.6064 -0.5620 -0.5871 -0.5336 -0.6207 -0.7063
22 300,000 0.3 0.001 1 0.0525 -0.0149 -0.0135 -0.4904 -0.4630 -0.4829 -0.2685 -0.4885 -0.4817
23 30,000 0.8 0.001 1 0.0926 -0.0880 -0.0854 -0.5210 -0.4417 -0.4962 -0.3190 -0.5255 -0.4920
24 300,000 0.8 0.001 1 0.0418 0.0219 0.0214 -0.4991 -0.4587 -0.4843 -0.2522 -0.4980 -0.4884

\(~\) \(~\) \(~\) \(~\)

The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
## -0.3079500 -0.3292667 -0.3086375 -0.5829875 -0.5844083 -0.6077250 -0.4863250 
## MSE_min_sp     EB_com 
## -0.6279125 -0.6529375

The methods are ranked according to the results above in ascending order:

##         EB        rep     UMVCUE    cl1_com    cl2_MLE    cl3_MSE    MSE_min 
##          9          7          8          5          4          3          6 
## MSE_min_sp     EB_com 
##          2          1

\(~\) \(~\) \(~\) \(~\)

Relative change in average MSE over all significant SNPs due to method implementation with a replication and discovery GWAS of equal size, using a significance threshold of \(5 \times 10^{-8}\) and a bimodal distribution of effect sizes:

\(~\) \(~\) \(~\) \(~\)

7. Winner’s Curse and Binary Traits

Under the same 24 scenarios, we simulate summary statistics for a binary trait with a disease prevalence of 0.1 and investigate the corresponding performance of the various Winner’s Curse adjustment methods. Below is a summary of the results contained in binary_norm_nsig_prop_bias_5e-8_10sim.csv and binary_norm_nsig_prop_bias_5e-4_10sim.csv. Compared to the quantitative trait above, far fewer SNPs meet the significance threshold of \(5 \times 10^{-8}\) under many of the scenarios.

Scenario n_samples h2 prop_effect S n_sig 5e-8 prop_bias 5e-8 prop_x 5e-8 mse 5e-8 n_sig 5e-4 prop_bias 5e-4 prop_x 5e-4 mse 5e-4 sd(n_sig) 5e-8 sd(prop_bias) 5e-8 sd(prop_x) 5e-8 sd(mse) 5e-8 sd(n_sig) 5e-4 sd(prop_bias) 5e-4 sd(prop_x) 5e-4 sd(mse) 5e-4
1 30,000 0.3 0.010 -1 0.1 1.0000 1.0000 0.043205 503.2 1.0000 1.0000 0.024306 0.316 NA NA NA 19.384 0.0000 0.0000 0.001972
2 300,000 0.3 0.010 -1 0.4 1.0000 1.0000 0.001337 563.2 1.0000 0.9590 0.002258 0.516 0.0000 0.0000 0.000347 23.748 0.0000 0.0107 0.000134
3 30,000 0.8 0.010 -1 2.3 1.0000 0.6667 0.018844 687.4 0.9960 0.8229 0.019493 1.337 0.0000 0.2976 0.009237 17.658 0.0028 0.0173 0.001161
4 300,000 0.8 0.010 -1 1353.9 0.7098 0.0732 0.000215 3870.3 0.7067 0.1751 0.000481 16.749 0.0106 0.0058 0.000011 44.672 0.0067 0.0044 0.000016
5 30,000 0.3 0.001 -1 0.2 1.0000 1.0000 0.023959 510.7 1.0000 0.9934 0.024185 0.422 0.0000 0.0000 0.011519 19.568 0.0000 0.0029 0.000956
6 300,000 0.3 0.001 -1 51.5 0.7965 0.1205 0.000298 713.3 0.9227 0.7224 0.001781 5.339 0.0565 0.0552 0.000110 28.960 0.0073 0.0099 0.000106
7 30,000 0.8 0.001 -1 130.2 0.7189 0.0761 0.002359 834.1 0.8651 0.6177 0.015121 7.657 0.0337 0.0238 0.000643 28.097 0.0080 0.0168 0.000689
8 300,000 0.8 0.001 -1 626.3 0.5347 0.0336 0.000186 1251.8 0.7170 0.4159 0.001057 17.982 0.0182 0.0052 0.000015 30.025 0.0117 0.0109 0.000059
9 30,000 0.3 0.010 0 0.1 1.0000 1.0000 0.031993 498.8 1.0000 1.0000 0.024378 0.316 NA NA NA 16.518 0.0000 0.0000 0.000919
10 300,000 0.3 0.010 0 0.3 1.0000 1.0000 0.001994 580.4 0.9998 0.9448 0.002149 0.483 0.0000 0.0000 0.002061 31.178 0.0005 0.0120 0.000140
11 30,000 0.8 0.010 0 5.3 0.9833 0.4772 0.003954 718.3 0.9925 0.7883 0.017537 2.111 0.0527 0.2430 0.001587 25.755 0.0032 0.0111 0.000823
12 300,000 0.8 0.010 0 1338.0 0.6829 0.0664 0.000120 3678.1 0.7050 0.1873 0.000463 22.376 0.0124 0.0047 0.000006 20.739 0.0059 0.0066 0.000024
13 30,000 0.3 0.001 0 0.0 NA NA NA 514.1 1.0000 0.9941 0.023966 0.000 NA NA NA 11.930 0.0000 0.0033 0.000874
14 300,000 0.3 0.001 0 56.4 0.7825 0.1075 0.000141 703.9 0.9209 0.7296 0.001785 5.661 0.0628 0.0393 0.000030 24.016 0.0068 0.0113 0.000134
15 30,000 0.8 0.001 0 130.3 0.6866 0.0622 0.001186 816.3 0.8683 0.6312 0.015436 8.111 0.0497 0.0275 0.000160 16.180 0.0073 0.0160 0.000864
16 300,000 0.8 0.001 0 585.6 0.5541 0.0330 0.000133 1218.8 0.7286 0.4246 0.001058 15.785 0.0179 0.0082 0.000013 20.115 0.0112 0.0155 0.000067
17 30,000 0.3 0.010 1 0.1 1.0000 1.0000 0.029079 505.2 1.0000 1.0000 0.024162 0.316 NA NA NA 15.259 0.0000 0.0000 0.001306
18 300,000 0.3 0.010 1 0.6 1.0000 0.8750 0.001202 580.7 0.9991 0.9277 0.002092 0.843 0.0000 0.2500 0.000601 20.907 0.0012 0.0087 0.000144
19 30,000 0.8 0.010 1 8.0 0.9546 0.3713 0.003110 730.7 0.9851 0.7653 0.017206 2.582 0.0761 0.2686 0.001241 26.779 0.0045 0.0202 0.001488
20 300,000 0.8 0.010 1 1343.7 0.6683 0.0574 0.000101 3445.6 0.6977 0.1899 0.000435 26.030 0.0139 0.0062 0.000003 42.717 0.0110 0.0075 0.000024
21 30,000 0.3 0.001 1 0.1 1.0000 1.0000 0.028158 502.1 0.9998 0.9924 0.023958 0.316 NA NA NA 22.951 0.0006 0.0037 0.001363
22 300,000 0.3 0.001 1 64.5 0.7587 0.0862 0.000115 697.8 0.9203 0.7345 0.001734 6.654 0.0399 0.0416 0.000032 28.142 0.0122 0.0132 0.000078
23 30,000 0.8 0.001 1 134.2 0.6752 0.0730 0.001053 793.5 0.8737 0.6537 0.016008 5.903 0.0244 0.0277 0.000162 21.629 0.0085 0.0167 0.000750
24 300,000 0.8 0.001 1 532.7 0.5474 0.0316 0.000106 1159.1 0.7369 0.4396 0.001107 12.763 0.0103 0.0064 0.000010 18.806 0.0080 0.0085 0.000067

\(~\) \(~\) \(~\) \(~\)

We plot \(z\) vs \(\text{bias}\) for the 24 different scenarios with a simulated binary trait. As above, the points corresponding to SNPs which are both significantly biased and significant at a threshold of \(5 \times 10^{-4}\) are coloured in navy.
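
The highlighting rule can be sketched as follows. This is a minimal illustration on simulated values rather than the code used to produce the plots: the effect sizes, the common standard error `se` and all object names are hypothetical, and bias is assumed here to mean \(\hat\beta_i - \beta_i\). The criterion for a significantly overestimated SNP is the one stated earlier, \(\left| \hat\beta_{i} \right| > \left| \beta_{i} \right| + 1.96\cdot\sigma_{i}\).

```r
set.seed(1)

# Hypothetical toy data: 1000 SNPs, 50 of which have a non-zero true effect
n_snps <- 1000
se <- rep(0.01, n_snps)
beta <- c(rnorm(50, mean = 0, sd = 0.03), rep(0, n_snps - 50))
beta_hat <- rnorm(n_snps, mean = beta, sd = se)

# z-statistic and bias (ASSUMPTION: bias = estimated minus true effect)
z <- beta_hat / se
bias <- beta_hat - beta

# Two-sided p < 5e-4 is equivalent to |z| exceeding this threshold
z_thresh <- qnorm(5e-4 / 2, lower.tail = FALSE)
is_sig <- abs(z) > z_thresh

# Significantly overestimated SNPs, using the criterion behind prop_x
is_x <- abs(beta_hat) > abs(beta) + 1.96 * se

# Points that would be coloured navy: significant and significantly biased
highlight <- is_sig & is_x
print(c(z_thresh = z_thresh, n_sig = sum(is_sig), n_highlight = sum(highlight)))
```

A call such as `plot(z, bias, col = ifelse(highlight, "navy", "grey"))` would then apply the colouring to a scatter plot of these toy values.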

\(~\) \(~\) \(~\) \(~\)


Summary of results for rel_mse contained in binary_norm_5e-8_10sim.csv:

Scenario n_samples h2 prop_effect S EB FIQT BR cl1 cl2 cl3 rep
1 30,000 0.3 0.010 -1 -0.8321 -0.7771 -0.9121 -0.6689 -0.6791 -0.6746 -0.9954
2 300,000 0.3 0.010 -1 -0.9371 -0.9499 -0.8904 -0.7777 -0.9229 -0.8986 -0.9533
3 30,000 0.8 0.010 -1 -0.7689 -0.7803 -0.7013 -0.0147 -0.7514 -0.4770 -0.8230
4 300,000 0.8 0.010 -1 -0.2148 0.1763 -0.0974 2.4029 1.1847 1.6734 -0.2074
5 30,000 0.3 0.001 -1 -0.8933 -0.8504 -0.9085 -0.9945 -0.8166 -0.9368 -0.9954
6 300,000 0.3 0.001 -1 -0.2884 0.6545 0.4088 2.4080 0.9838 1.5630 -0.3425
7 30,000 0.8 0.001 -1 -0.1623 1.0327 0.6538 3.7188 1.7049 2.5403 0.0034
8 300,000 0.8 0.001 -1 0.0580 0.3212 0.6455 0.7320 0.6382 0.6034 -0.0108
9 30,000 0.3 0.010 0 -0.8812 -0.8610 -0.9138 -0.9951 -0.8308 -0.9421 -0.9792
10 300,000 0.3 0.010 0 -0.9555 -0.9388 -0.9438 -0.7604 -0.8773 -0.8638 -0.8946
11 30,000 0.8 0.010 0 -0.6630 -0.6213 -0.4645 0.6283 -0.5625 -0.0915 -0.7332
12 300,000 0.8 0.010 0 -0.2002 0.1320 -0.0877 2.1744 1.0360 1.4948 -0.2334
13 30,000 0.3 0.001 0 NA NA NA NA NA NA NA
14 300,000 0.3 0.001 0 -0.2525 0.8169 0.5802 2.9517 1.2249 1.9364 -0.2569
15 30,000 0.8 0.001 0 -0.1714 0.6272 0.4434 2.2274 1.1211 1.5730 -0.2541
16 300,000 0.8 0.001 0 0.0511 0.4392 0.7611 1.2359 0.7964 0.9527 0.0005
17 30,000 0.3 0.010 1 -0.8904 -0.8600 -0.9145 -0.9951 -0.8296 -0.9416 -0.9994
18 300,000 0.3 0.010 1 -0.6472 -0.7991 -0.4708 -0.0551 -0.8710 -0.5756 -0.8783
19 30,000 0.8 0.010 1 -0.7983 -0.7001 -0.6627 0.2694 -0.5722 -0.2399 -0.7207
20 300,000 0.8 0.010 1 -0.1584 0.1997 -0.0140 2.1550 1.0719 1.5083 -0.1636
21 30,000 0.3 0.001 1 -0.8899 -0.8676 -0.9156 -0.9954 -0.8391 -0.9451 -0.9961
22 300,000 0.3 0.001 1 -0.1822 0.8427 0.6334 2.7302 1.2514 1.8581 -0.3071
23 30,000 0.8 0.001 1 -0.0358 0.7604 0.5807 2.4714 1.1717 1.6978 -0.0967
24 300,000 0.8 0.001 1 0.0415 0.3314 0.6490 0.9037 0.7276 0.7336 -0.0583

\(~\) \(~\) \(~\) \(~\)

The mean relative change in average MSE over all significant SNPs is obtained for each method across the 23 scenarios with a non-missing result (Scenario 13 is excluded):

##         EB       FIQT         BR        cl1        cl2        cl3        rep 
## -0.4640130 -0.1161478 -0.1539652  0.9022696  0.1895696  0.4586174 -0.5171957

The methods are ranked according to the results above in ascending order:

##   EB FIQT   BR  cl1  cl2  cl3  rep 
##    2    4    3    7    5    6    1

\(~\) \(~\) \(~\) \(~\)

Relative change in average MSE over all significant SNPs due to method implementation for a binary trait, using a significance threshold of \(5 \times 10^{-8}\):

\(~\) \(~\) \(~\) \(~\)

Note that Scenario 13 is absent from the above plot because no SNPs reached the significance threshold in any of the 10 simulations. The first table in this section gives the average number of significant SNPs under each scenario.
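
Because Scenario 13 contributes `NA`, any average over scenarios has to drop it. A minimal sketch using the EB column of the rel_mse table above (the vector name `eb_rel_mse` is illustrative) shows that averaging over the remaining 23 scenarios gives the EB mean reported above.

```r
# EB column of the rel_mse table above, Scenarios 1 to 24 (Scenario 13 is NA)
eb_rel_mse <- c(
  -0.8321, -0.9371, -0.7689, -0.2148, -0.8933, -0.2884, -0.1623,  0.0580,
  -0.8812, -0.9555, -0.6630, -0.2002,      NA, -0.2525, -0.1714,  0.0511,
  -0.8904, -0.6472, -0.7983, -0.1584, -0.8899, -0.1822, -0.0358,  0.0415
)

# Number of scenarios that actually enter the average
n_used <- sum(!is.na(eb_rel_mse))

# na.rm = TRUE drops the missing scenario before averaging
eb_mean <- mean(eb_rel_mse, na.rm = TRUE)
print(c(n_used = n_used, eb_mean = eb_mean))
```

Without `na.rm = TRUE`, `mean()` would return `NA` for every method in this table.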

\(~\) \(~\) \(~\) \(~\)

Summary of results for rel_mse contained in binary_norm_5e-4_10sim.csv:

Scenario n_samples h2 prop_effect S EB FIQT BR cl1 cl2 cl3 rep
1 30,000 0.3 0.010 -1 -0.9941 -0.9988 -0.9147 -0.8562 -0.6989 -0.7985 -0.9300
2 300,000 0.3 0.010 -1 -0.9753 -0.9777 -0.9085 -0.8461 -0.7023 -0.7959 -0.9218
3 30,000 0.8 0.010 -1 -0.9080 -0.9117 -0.8659 -0.8112 -0.6997 -0.7788 -0.9099
4 300,000 0.8 0.010 -1 -0.3577 -0.1615 -0.2376 -0.1789 -0.3066 -0.2761 -0.6454
5 30,000 0.3 0.001 -1 -0.9907 -0.9943 -0.9130 -0.8427 -0.6935 -0.7882 -0.9280
6 300,000 0.3 0.001 -1 -0.8308 -0.8050 -0.7557 -0.7759 -0.6612 -0.7417 -0.9043
7 30,000 0.8 0.001 -1 -0.8103 -0.7525 -0.7035 -0.7686 -0.6481 -0.7314 -0.8896
8 300,000 0.8 0.001 -1 -0.7437 -0.6824 -0.5465 -0.7248 -0.5879 -0.6805 -0.8323
9 30,000 0.3 0.010 0 -0.9954 -0.9987 -0.9134 -0.8510 -0.6963 -0.7942 -0.9299
10 300,000 0.3 0.010 0 -0.9755 -0.9775 -0.9024 -0.8420 -0.6984 -0.7915 -0.9275
11 30,000 0.8 0.010 0 -0.9191 -0.9307 -0.8659 -0.8246 -0.6966 -0.7831 -0.9154
12 300,000 0.8 0.010 0 -0.4626 -0.3698 -0.3970 -0.3819 -0.4431 -0.4437 -0.7320
13 30,000 0.3 0.001 0 -0.9932 -0.9965 -0.9120 -0.8503 -0.6966 -0.7943 -0.9299
14 300,000 0.3 0.001 0 -0.8928 -0.8802 -0.8046 -0.8170 -0.6802 -0.7708 -0.9158
15 30,000 0.8 0.001 0 -0.8400 -0.8168 -0.7571 -0.7792 -0.6582 -0.7405 -0.9031
16 300,000 0.8 0.001 0 -0.7472 -0.6972 -0.5741 -0.7391 -0.6133 -0.6978 -0.8599
17 30,000 0.3 0.010 1 -0.9962 -0.9989 -0.9136 -0.8505 -0.6966 -0.7940 -0.9294
18 300,000 0.3 0.010 1 -0.9709 -0.9739 -0.9010 -0.8378 -0.6965 -0.7881 -0.9234
19 30,000 0.8 0.010 1 -0.9072 -0.9181 -0.8523 -0.8179 -0.6914 -0.7770 -0.9155
20 300,000 0.8 0.010 1 -0.5086 -0.4340 -0.4411 -0.4843 -0.4899 -0.5149 -0.7707
21 30,000 0.3 0.001 1 -0.9913 -0.9961 -0.9128 -0.8582 -0.7000 -0.8001 -0.9274
22 300,000 0.3 0.001 1 -0.8968 -0.8857 -0.8094 -0.8129 -0.6767 -0.7661 -0.9155
23 30,000 0.8 0.001 1 -0.8700 -0.8444 -0.7759 -0.7960 -0.6664 -0.7523 -0.9048
24 300,000 0.8 0.001 1 -0.7955 -0.7539 -0.6205 -0.7730 -0.6339 -0.7268 -0.8704

\(~\) \(~\) \(~\) \(~\)

The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##         EB       FIQT         BR        cl1        cl2        cl3        rep 
## -0.8488708 -0.8231792 -0.7582708 -0.7550042 -0.6430125 -0.7219292 -0.8846625

The methods are ranked according to the results above in ascending order:

##   EB FIQT   BR  cl1  cl2  cl3  rep 
##    2    3    4    5    7    6    1

\(~\) \(~\) \(~\) \(~\)

Relative change in average MSE over all significant SNPs due to method implementation for a binary trait, using a significance threshold of \(5 \times 10^{-4}\):

8. Comparison of the Empirical Bayes method with the true Bayes rule

Summary of results for rel_mse contained in norm_5e-8_10sim_bayes_compare.csv:

Scenario n_samples h2 prop_effect S EB bayes rep
1 30,000 0.3 0.010 -1 -0.6882 -0.7705 -0.9846
2 300,000 0.3 0.010 -1 -0.3039 -0.3272 -0.2882
3 30,000 0.8 0.010 -1 -0.6578 -0.8129 -0.7883
4 300,000 0.8 0.010 -1 -0.0822 -0.0992 -0.0425
5 30,000 0.3 0.001 -1 -0.1343 -0.2759 -0.2324
6 300,000 0.3 0.001 -1 0.0362 -0.0369 0.0439
7 30,000 0.8 0.001 -1 0.0393 -0.1207 -0.0825
8 300,000 0.8 0.001 -1 0.0787 -0.0273 -0.0052
9 30,000 0.3 0.010 0 -0.6875 -0.9251 -0.6995
10 300,000 0.3 0.010 0 -0.2877 -0.3153 -0.3129
11 30,000 0.8 0.010 0 -0.6383 -0.7271 -0.7218
12 300,000 0.8 0.010 0 -0.0966 -0.1113 -0.1277
13 30,000 0.3 0.001 0 -0.1844 -0.2828 -0.2980
14 300,000 0.3 0.001 0 0.0482 -0.0463 0.0126
15 30,000 0.8 0.001 0 0.0058 -0.1168 -0.0662
16 300,000 0.8 0.001 0 0.1245 -0.0241 -0.0364
17 30,000 0.3 0.010 1 -0.6981 -0.9261 -0.8833
18 300,000 0.3 0.010 1 -0.2217 -0.2493 -0.2487
19 30,000 0.8 0.010 1 -0.5748 -0.6841 -0.6226
20 300,000 0.8 0.010 1 -0.0721 -0.0904 -0.0740
21 30,000 0.3 0.001 1 -0.1841 -0.3184 -0.2900
22 300,000 0.3 0.001 1 0.0408 -0.0386 -0.0108
23 30,000 0.8 0.001 1 0.0218 -0.1025 -0.1438
24 300,000 0.8 0.001 1 0.1391 -0.0301 -0.0311

\(~\) \(~\) \(~\) \(~\)

The mean relative change in average MSE over all significant SNPs across the 24 scenarios is obtained for each method:

##         EB      bayes        rep 
## -0.2073875 -0.3107875 -0.2889167

The methods are ranked according to the results above in ascending order:

##    EB bayes   rep 
##     3     1     2
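
For orientation, a Bayes rule of this kind can be sketched under a simplified prior. The sketch below assumes that a proportion `prop_effect` of SNPs have effects drawn from \(N(0, \tau^2)\) and that the remaining SNPs have exactly zero effect, with \(\hat\beta_i \mid \beta_i \sim N(\beta_i, \sigma_i^2)\). This is an assumption made for illustration: the prior used to produce the `bayes` column may differ, for example if the effect size variance depends on allele frequency, and the function name `bayes_rule` and the argument `tau` are hypothetical.

```r
# Posterior mean of beta given beta_hat under a point-mass-plus-normal prior
# (illustrative only; not necessarily the rule behind the `bayes` column)
bayes_rule <- function(beta_hat, se, prop_effect, tau) {
  # Marginal density of beta_hat when the SNP has a non-zero effect
  slab <- prop_effect * dnorm(beta_hat, mean = 0, sd = sqrt(tau^2 + se^2))
  # Marginal density of beta_hat when the true effect is exactly zero
  spike <- (1 - prop_effect) * dnorm(beta_hat, mean = 0, sd = se)
  # Posterior probability that the SNP has a non-zero effect
  w <- slab / (slab + spike)
  # Posterior mean: inclusion probability times the usual normal shrinkage
  w * tau^2 / (tau^2 + se^2) * beta_hat
}

# Hypothetical estimates with a common standard error of 0.01
beta_hat <- c(-0.06, -0.02, 0, 0.02, 0.06)
shrunk <- bayes_rule(beta_hat, se = 0.01, prop_effect = 0.01, tau = 0.03)
print(round(shrunk, 4))
```

Under this prior, every estimate is pulled towards zero, and estimates that are only moderately large relative to their standard error are shrunk the most in relative terms because their posterior probability of a non-zero effect is low.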

\(~\) \(~\) \(~\) \(~\)

Relative change in average MSE over all significant SNPs due to method implementation, using a significance threshold of \(5 \times 10^{-8}\):

\(~\) \(~\) \(~\) \(~\)